Experiments in Automated Lexicon Building for Text Searching

Authors

  • Barry Schiffman
  • Kathleen McKeown
Abstract

This paper describes experiments in the automatic construction of lexicons that would be useful in searching large document collections for text fragments that address a specific information need, such as an answer to a question.

Similar resources

Building a Bilingual Lexicon Using Phrase-based Statistical Machine Translation via a Pivot Language

This paper proposes a novel method for building a bilingual lexicon through a pivot language by using phrase-based statistical machine translation (SMT). Given two bilingual lexicons between language pairs Lf–Lp and Lp–Le, we assume these lexicons as parallel corpora. Then, we merge the extracted two phrase tables into one phrase table between Lf and Le. Finally, we construct a phrase-based SMT...

Semi-Supervised Acquisition of a Spanish Lexicon from a Portuguese Seed Lexicon

This paper deals with the automated acquisition of a Spanish medical subword lexicon from an already existing Portuguese seed lexicon. Using two nonparallel monolingual corpora we determine Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We are still working on the experiments and trying to achieve a good method for validating the translation hypothe...

Automated Text Analysis Based on Skip-Gram Model for Food Evaluation in Predicting Consumer Acceptance

The purpose of this paper is to evaluate food taste, smell, and characteristics from consumers' online reviews. Several studies in food sensory evaluation have been presented for consumer acceptance. However, these studies need taste descriptive word lexicon, and they are not suitable for analyzing large number of evaluators to predict consumer acceptance. In this paper, an automated text analy...

Automatic Valency Derivation for Related Languages

This paper describes an experiment combining several existing data resources (parallel corpora, valency lexicon, morphological taggers, bilingual dictionary etc.) and exploiting them in a task of building a valency lexicon for a related language (Russian) derived from a high quality manually created valency lexicon for Czech (Vallex) containing several thousands of verbs with very rich syntacti...

A Linguistic Analysis of Conference Titles in Applied Linguistics

Over the past twenty-five years, researchers have expressed considerable interest in titles of academic publications. Unfortunately, conference paper titles (CPTs) have only recently begun to receive attention. The aim of this study, therefore, is to investigate the text length, syntactic structure, and lexicon of CPTs in Applied Linguistics. A data set of 698 titles was selected from the 2008 ...

Publication date: 2000